Overview

Dataset statistics

Number of variables: 4
Number of observations: 1247
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 38
Duplicate rows (%): 3.0%
Total size in memory: 388.0 KiB
Average record size in memory: 318.6 B
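
The two derived figures in this table (duplicate percentage and average record size) can be cross-checked from the raw counts. The following is a minimal sketch, assuming the report rounds to one decimal place and that 1 KiB is 1024 bytes. The variable names are illustrative and not part of ydata-profiling.

```python
# Values copied from the Dataset statistics table.
n_rows = 1247
n_duplicate_rows = 38
total_size_kib = 388.0

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 3.0%
print(f"{avg_record_bytes:.1f} B")  # 318.6 B
```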

Variable types

Text: 3
Categorical: 1

Alerts

Duplicates: Dataset has 38 (3.0%) duplicate rows

Reproduction

Analysis started: 2024-02-06 17:05:58.144966
Analysis finished: 2024-02-06 17:06:09.226954
Duration: 11.08 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
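
The reported duration follows from the two timestamps. A small sketch of that subtraction, assuming both timestamps are in the same time zone (the format string is an assumption based on how they are printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-02-06 17:05:58.144966", FMT)
finished = datetime.strptime("2024-02-06 17:06:09.226954", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 11.08 seconds
```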

Variables

Nome
Text

Distinct: 870
Distinct (%): 69.8%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Length

Max length: 19
Median length: 17
Mean length: 7.723336
Min length: 3

Characters and Unicode

Total characters: 9631
Distinct characters: 61
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
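
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point (for example `Ll` for Lowercase Letter and `Lu` for Uppercase Letter). The sketch below counts categories over the five sample values shown in this section. The function name is hypothetical, and the standard module does not expose scripts or blocks, so those are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Ll', 'Lu')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# First five values of the Nome column, as listed under Sample.
sample = ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander", "Charmeleon"]
counts = category_counts(sample)
print(counts["Lu"], counts["Ll"])  # 5 39
```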

Unique

Unique: 554
Unique (%): 44.4%
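
Distinct and Unique are different measures, which is why the column reports 870 distinct values but only 554 unique ones. The sketch below assumes the usual reading: distinct counts every different value once, while unique counts only values that occur exactly once. The helper name and the input list are illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique

# Illustrative input: one name is repeated, so it is distinct but not unique.
names = ["Bulbasaur", "Ivysaur", "Indeedee", "Indeedee", "Eiscue"]
print(distinct_and_unique(names))  # (4, 3)
```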

Sample

1st row: Bulbasaur
2nd row: Ivysaur
3rd row: Venusaur
4th row: Charmander
5th row: Charmeleon

Value | Count | Frequency (%)
galarian | 12 | 0.9%
indeedee | 7 | 0.6%
silicobra | 5 | 0.4%
sandaconda | 5 | 0.4%
darmanitan | 5 | 0.4%
morgrem | 4 | 0.3%
tapu | 4 | 0.3%
rapidash | 4 | 0.3%
corsola | 4 | 0.3%
eiscue | 4 | 0.3%
Other values (860) | 1214 | 95.7%
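
This table counts individual words rather than whole values, and the entries are lowercased. One way to produce such counts is sketched below, assuming whitespace tokenization, lowercasing, and stripping of surrounding punctuation; the tool's exact tokenization rules may differ. The function name and the input names are made up for illustration.

```python
import string
from collections import Counter

def word_counts(values):
    """Count lowercased, punctuation-stripped words across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:
                counts[word] += 1
    return counts

# Hypothetical input values, not taken from the dataset.
counts = word_counts(["Galarian Meowth", "Galarian Ponyta", "Tapu Koko", "Mr. Mime"])
total = sum(counts.values())

# Print each word with its count and its share of all words.
for word, n in counts.most_common():
    print(word, n, f"{100 * n / total:.1f}%")
```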

Most occurring characters

Value | Count | Frequency (%)
a | 918 | 9.5%
e | 805 | 8.4%
o | 769 | 8.0%
r | 734 | 7.6%
i | 687 | 7.1%
n | 590 | 6.1%
l | 568 | 5.9%
t | 444 | 4.6%
u | 388 | 4.0%
s | 322 | 3.3%
Other values (51) | 3406 | 35.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8325 | 86.4%
Uppercase Letter | 1270 | 13.2%
Space Separator | 21 | 0.2%
Other Punctuation | 7 | 0.1%
Dash Punctuation | 5 | 0.1%
Other Symbol | 2 | < 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 918 | 11.0%
e | 805 | 9.7%
o | 769 | 9.2%
r | 734 | 8.8%
i | 687 | 8.3%
n | 590 | 7.1%
l | 568 | 6.8%
t | 444 | 5.3%
u | 388 | 4.7%
s | 322 | 3.9%
Other values (17) | 2100 | 25.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 170 | 13.4%
C | 106 | 8.3%
M | 91 | 7.2%
G | 88 | 6.9%
P | 86 | 6.8%
D | 81 | 6.4%
T | 77 | 6.1%
B | 67 | 5.3%
A | 59 | 4.6%
F | 54 | 4.3%
Other values (16) | 391 | 30.8%

Other Punctuation
Value | Count | Frequency (%)
. | 3 | 42.9%
: | 2 | 28.6%
' | 2 | 28.6%

Other Symbol
Value | Count | Frequency (%)
(symbol not preserved) | 1 | 50.0%
(symbol not preserved) | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 21 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 9595 | 99.6%
Common | 36 | 0.4%

Most frequent character per script

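Latin
Value | Count | Frequency (%)
a | 918 | 9.6%
e | 805 | 8.4%
o | 769 | 8.0%
r | 734 | 7.6%
i | 687 | 7.2%
n | 590 | 6.1%
l | 568 | 5.9%
t | 444 | 4.6%
u | 388 | 4.0%
s | 322 | 3.4%
Other values (43) | 3370 | 35.1%

Common
Value | Count | Frequency (%)
(space) | 21 | 58.3%
- | 5 | 13.9%
. | 3 | 8.3%
: | 2 | 5.6%
' | 2 | 5.6%
(symbol not preserved) | 1 | 2.8%
(symbol not preserved) | 1 | 2.8%
2 | 1 | 2.8%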

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9625 | 99.9%
None | 4 | < 0.1%
Misc Symbols | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 918 | 9.5%
e | 805 | 8.4%
o | 769 | 8.0%
r | 734 | 7.6%
i | 687 | 7.1%
n | 590 | 6.1%
l | 568 | 5.9%
t | 444 | 4.6%
u | 388 | 4.0%
s | 322 | 3.3%
Other values (48) | 3400 | 35.3%

None
Value | Count | Frequency (%)
é | 4 | 100.0%

Misc Symbols
Value | Count | Frequency (%)
(symbol not preserved) | 1 | 50.0%
(symbol not preserved) | 1 | 50.0%

Tipo
Text

Distinct: 209
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Memory size: 99.0 KiB

Length

Max length: 19
Median length: 16
Mean length: 10.092221
Min length: 3

Characters and Unicode

Total characters: 12585
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 4.0%

Sample

1st row: Planta, Venenoso
2nd row: Planta, Venenoso
3rd row: Planta, Venenoso
4th row: Fogo
5th row: Fogo

Value | Count | Frequency (%)
água | 181 | 9.6%
normal | 150 | 8.0%
planta | 148 | 7.9%
voador | 136 | 7.2%
psíquico | 123 | 6.5%
inseto | 113 | 6.0%
fogo | 96 | 5.1%
fada | 91 | 4.8%
lutador | 89 | 4.7%
venenoso | 87 | 4.6%
Other values (15) | 671 | 35.6%

Most occurring characters

Value | Count | Frequency (%)
o | 1587 | 12.6%
a | 1533 | 12.2%
r | 932 | 7.4%
e | 694 | 5.5%
t | 690 | 5.5%
, | 638 | 5.1%
(space) | 638 | 5.1%
n | 580 | 4.6%
l | 504 | 4.0%
s | 469 | 3.7%
Other values (28) | 4320 | 34.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9424 | 74.9%
Uppercase Letter | 1885 | 15.0%
Other Punctuation | 638 | 5.1%
Space Separator | 638 | 5.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 1587 | 16.8%
a | 1533 | 16.3%
r | 932 | 9.9%
e | 694 | 7.4%
t | 690 | 7.3%
n | 580 | 6.2%
l | 504 | 5.3%
s | 469 | 5.0%
u | 446 | 4.7%
d | 363 | 3.9%
Other values (11) | 1626 | 17.3%

Uppercase Letter
Value | Count | Frequency (%)
P | 318 | 16.9%
F | 266 | 14.1%
V | 230 | 12.2%
N | 202 | 10.7%
Á | 181 | 9.6%
I | 113 | 6.0%
T | 97 | 5.1%
L | 89 | 4.7%
E | 88 | 4.7%
D | 82 | 4.4%
Other values (5) | 219 | 11.6%

Other Punctuation
Value | Count | Frequency (%)
, | 638 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 638 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11309 | 89.9%
Common | 1276 | 10.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 1587 | 14.0%
a | 1533 | 13.6%
r | 932 | 8.2%
e | 694 | 6.1%
t | 690 | 6.1%
n | 580 | 5.1%
l | 504 | 4.5%
s | 469 | 4.1%
u | 446 | 3.9%
d | 363 | 3.2%
Other values (26) | 3511 | 31.0%

Common
Value | Count | Frequency (%)
, | 638 | 50.0%
(space) | 638 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12086 | 96.0%
None | 499 | 4.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 1587 | 13.1%
a | 1533 | 12.7%
r | 932 | 7.7%
e | 694 | 5.7%
t | 690 | 5.7%
, | 638 | 5.3%
(space) | 638 | 5.3%
n | 580 | 4.8%
l | 504 | 4.2%
s | 469 | 3.9%
Other values (23) | 3821 | 31.6%

None
Value | Count | Frequency (%)
Á | 181 | 36.3%
í | 123 | 24.6%
é | 87 | 17.4%
ã | 82 | 16.4%
ç | 26 | 5.2%

Habilidades
Text

Distinct: 187
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 130.0 KiB

Length

Max length: 38
Median length: 34
Mean length: 24.806736
Min length: 11

Characters and Unicode

Total characters: 30934
Distinct characters: 51
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 2.9%

Sample

1st row: Raio Solar, Veneno Ácido
2nd row: Raio Solar, Veneno Ácido
3rd row: Raio Solar, Veneno Ácido
4th row: Chama, Investida de Fogo
5th row: Chama, Investida de Fogo

Value | Count | Frequency (%)
investida | 764 | 16.6%
de | 527 | 11.4%
raio | 370 | 8.0%
pedra | 225 | 4.9%
surf | 174 | 3.8%
confusão | 165 | 3.6%
solar | 119 | 2.6%
vento | 113 | 2.4%
cortante | 112 | 2.4%
psíquico | 106 | 2.3%
Other values (59) | 1940 | 42.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 3368 | 10.9%
a | 2899 | 9.4%
o | 2802 | 9.1%
e | 2353 | 7.6%
i | 1931 | 6.2%
d | 1817 | 5.9%
n | 1539 | 5.0%
r | 1421 | 4.6%
t | 1292 | 4.2%
, | 1247 | 4.0%
Other values (41) | 10265 | 33.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 22278 | 72.0%
Uppercase Letter | 4041 | 13.1%
Space Separator | 3368 | 10.9%
Other Punctuation | 1247 | 4.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 2899 | 13.0%
o | 2802 | 12.6%
e | 2353 | 10.6%
i | 1931 | 8.7%
d | 1817 | 8.2%
n | 1539 | 6.9%
r | 1421 | 6.4%
t | 1292 | 5.8%
s | 1216 | 5.5%
v | 814 | 3.7%
Other values (20) | 4194 | 18.8%

Uppercase Letter
Value | Count | Frequency (%)
I | 764 | 18.9%
S | 542 | 13.4%
P | 470 | 11.6%
C | 433 | 10.7%
R | 370 | 9.2%
V | 246 | 6.1%
D | 240 | 5.9%
Á | 164 | 4.1%
G | 149 | 3.7%
A | 140 | 3.5%
Other values (9) | 523 | 12.9%

Space Separator
Value | Count | Frequency (%)
(space) | 3368 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1247 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 26319 | 85.1%
Common | 4615 | 14.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 2899 | 11.0%
o | 2802 | 10.6%
e | 2353 | 8.9%
i | 1931 | 7.3%
d | 1817 | 6.9%
n | 1539 | 5.8%
r | 1421 | 5.4%
t | 1292 | 4.9%
s | 1216 | 4.6%
v | 814 | 3.1%
Other values (39) | 8235 | 31.3%

Common
Value | Count | Frequency (%)
(space) | 3368 | 73.0%
, | 1247 | 27.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30235 | 97.7%
None | 699 | 2.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 3368 | 11.1%
a | 2899 | 9.6%
o | 2802 | 9.3%
e | 2353 | 7.8%
i | 1931 | 6.4%
d | 1817 | 6.0%
n | 1539 | 5.1%
r | 1421 | 4.7%
t | 1292 | 4.3%
, | 1247 | 4.1%
Other values (33) | 9566 | 31.6%

None
Value | Count | Frequency (%)
ã | 216 | 30.9%
Á | 164 | 23.5%
í | 106 | 15.2%
â | 94 | 13.4%
ô | 80 | 11.4%
ó | 25 | 3.6%
á | 11 | 1.6%
é | 3 | 0.4%

Geração
Categorical

Distinct: 8
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 80.2 KiB

Value | Count
Quarta | 432
Quinta | 155
Primeira | 151
Terceira | 135
Oitava | 113
Other values (3) | 261

Length

Max length: 8
Median length: 6
Mean length: 6.4811548
Min length: 5

Characters and Unicode

Total characters: 8082
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Primeira
2nd row: Primeira
3rd row: Primeira
4th row: Primeira
5th row: Primeira

Common Values

Value | Count | Frequency (%)
Quarta | 432 | 34.6%
Quinta | 155 | 12.4%
Primeira | 151 | 12.1%
Terceira | 135 | 10.8%
Oitava | 113 | 9.1%
Segunda | 100 | 8.0%
Sétima | 89 | 7.1%
Sexta | 72 | 5.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
quarta | 432 | 34.6%
quinta | 155 | 12.4%
primeira | 151 | 12.1%
terceira | 135 | 10.8%
oitava | 113 | 9.1%
segunda | 100 | 8.0%
sétima | 89 | 7.1%
sexta | 72 | 5.8%

Most occurring characters

Value | Count | Frequency (%)
a | 1792 | 22.2%
r | 1004 | 12.4%
t | 861 | 10.7%
i | 794 | 9.8%
u | 687 | 8.5%
e | 593 | 7.3%
Q | 587 | 7.3%
S | 261 | 3.2%
n | 255 | 3.2%
m | 240 | 3.0%
Other values (9) | 1008 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6835 | 84.6%
Uppercase Letter | 1247 | 15.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1792 | 26.2%
r | 1004 | 14.7%
t | 861 | 12.6%
i | 794 | 11.6%
u | 687 | 10.1%
e | 593 | 8.7%
n | 255 | 3.7%
m | 240 | 3.5%
c | 135 | 2.0%
v | 113 | 1.7%
Other values (4) | 361 | 5.3%

Uppercase Letter
Value | Count | Frequency (%)
Q | 587 | 47.1%
S | 261 | 20.9%
P | 151 | 12.1%
T | 135 | 10.8%
O | 113 | 9.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8082 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1792 | 22.2%
r | 1004 | 12.4%
t | 861 | 10.7%
i | 794 | 9.8%
u | 687 | 8.5%
e | 593 | 7.3%
Q | 587 | 7.3%
S | 261 | 3.2%
n | 255 | 3.2%
m | 240 | 3.0%
Other values (9) | 1008 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7993 | 98.9%
None | 89 | 1.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1792 | 22.4%
r | 1004 | 12.6%
t | 861 | 10.8%
i | 794 | 9.9%
u | 687 | 8.6%
e | 593 | 7.4%
Q | 587 | 7.3%
S | 261 | 3.3%
n | 255 | 3.2%
m | 240 | 3.0%
Other values (8) | 919 | 11.5%

None
Value | Count | Frequency (%)
é | 89 | 100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | Nome | Tipo | Habilidades | Geração
0 | Bulbasaur | Planta, Venenoso | Raio Solar, Veneno Ácido | Primeira
1 | Ivysaur | Planta, Venenoso | Raio Solar, Veneno Ácido | Primeira
2 | Venusaur | Planta, Venenoso | Raio Solar, Veneno Ácido | Primeira
3 | Charmander | Fogo | Chama, Investida de Fogo | Primeira
4 | Charmeleon | Fogo | Chama, Investida de Fogo | Primeira
5 | Charizard | Fogo, Voador | Chama, Bico Perfurante | Primeira
6 | Squirtle | Água | Surf, Jato de Água | Primeira
7 | Wartortle | Água | Surf, Jato de Água | Primeira
8 | Blastoise | Água | Surf, Jato de Água | Primeira
9 | Caterpie | Inseto | Investida, Pó Venenoso | Primeira

Index | Nome | Tipo | Habilidades | Geração
1237 | Morgrem | Noturno, Fada | Punho Sombrio, Desejo Misterioso | Oitava
1238 | Perrserker | Aço | Asas de Ferro, Corte Feroz | Oitava
1239 | Copperajah | Metal, Elétrico | Corte Feroz, Choque do Trovão | Oitava
1240 | Falinks | Lutador | Soco Dinâmico, Investida | Oitava
1241 | Pincurchin | Elétrico | Choque do Trovão, Esfera Aura | Oitava
1242 | Cursola | Fantasma | Bola Sombria, Confusão | Oitava
1243 | Runerigus | Fantasma, Terra | Bola Sombria, Terremoto | Oitava
1244 | Stonjourner | Rocha | Pedra Afiada, Investida | Oitava
1245 | Eiscue | Gelo | Raio de Gelo, Tumulto | Oitava
1246 | Indeedee | Psíquico, Normal | Confusão, Desejo | Oitava

Duplicate rows

Most frequently occurring

Index | Nome | Tipo | Habilidades | Geração | # duplicates
26 | Indeedee | Psíquico, Normal | Confusão, Desejo | Oitava | 6
0 | Appletun | Planta, Dragão | Raio Solar, Garra Dragônica | Oitava | 3
1 | Applin | Planta, Dragão | Raio Solar, Garra Dragônica | Oitava | 3
2 | Arrokuda | Água | Surf, Aqua Jet | Oitava | 3
3 | Boltent | Elétrico | Raio Veloz, Choque do Trovão | Oitava | 3
5 | Coalossal | Fogo, Rocha | Chama, Pedra Afiada | Oitava | 3
6 | Copperajah | Metal, Elétrico | Corte Feroz, Choque do Trovão | Oitava | 3
8 | Cramorant | Voador, Água | Surf, Bico Perfurante | Oitava | 3
9 | Cursola | Fantasma | Bola Sombria, Confusão | Oitava | 3
10 | Drednaw | Água, Rocha | Surf, Investida de Pedra | Oitava | 3